-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor I/O and add SD_NOTIFY proxy support #182
Refactor I/O and add SD_NOTIFY proxy support #182
Conversation
@haircommander PTAL This is another take on what I had done w/ the syslog proxy. You had requested the console sockets and udp sockets share more code - I've done that by moving some flags and state into the structures used to manage the sockets. In addition this implements a SD-NOTIFY proxy using the same constructs. SD-NOTIFY conversation from here containers/podman#6688 |
e8747ba
to
4032e7c
Compare
22741d9
to
70ecb8c
Compare
I am a little worried about SELinux here, making sure that the container processes are still allowed to write the SD_NOTIFY socket. The path needs to be writeable from the container, and the listener has to be ok. |
Agreed, I need to revisit podman doing the bind mount - specifically setting the mount label appropriately. Also even though I'm creating the socket as 777, it's getting reset to 750. Haven't figured that out yet. |
libpod now relabels /run/notify |
And I get this
Do I need to change the tcontext somehow? Suggestions? |
Note @rhatdan this does work because in podman I'm setting an appropriate label on the bind mount. Tested with and without selinux, root and rootless. I did hardcode in the NOTIFY_SOCKET inside the container, if we want that to be optional - maybe an additional env variable of something like NOTIFY_SOCKET_INSIDE or something can indicate the destination? |
src/conmon.c
Outdated
@@ -84,6 +84,16 @@ int main(int argc, char *argv[]) | |||
exit(0); | |||
} | |||
|
|||
char *notify_socket_path = getenv("NOTIFY_SOCKET"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason to expose this via env vs cli?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just that it's how the notify socket is usually passed from systemd et al - but podman is always our parent, so I'm fine with whatever.
Maybe --notify-socket-host and --notify-socket-destdir or something - that can get rid of the hardcoded path of /run/notify/notify.sock in the container, and instead have it be (notify-socket-destdir)/notify.sock in the container.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I like that idea. then it can be exposed via cli on podman's side if the user wants to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we include the commit 52546c919be261e74847981b4db1227537c47345
into the commit that first introduced the change?
Also, I think we need a way to disable the notify socket to be handled by conmon, there are cases when it is helpful to use the handling in the OCI runtime. For example when the user needs $RUNTIME start $CTR
to wait for the container to be ready.
sorry it took so long to finally review. all together it looks great. thank you very much for taking this on. I'd also like a review from @giuseppe |
Also FYI, even if the notify socket cmdline/env are passed.. right now podman mkdir's the notify folder because I have it set the mount label, and pass the mount label as a bind mount to the spec. If the calling process doesn't create that socket, but a NOTIFY_SOCKET is passed, it'll error with a cryptic error. It was just an order of operations thing - that the runtime spec and mounts get done before conmon is called. We could, instead, throw a nice error , like "notify socket passed but no folder created" or something. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the patch. I think it is a great idea but I've not managed to get it to work:
$ NOTIFY_SOCKET=/run/user/1000/systemd/notify podman --conmon ~/src/conmon/bin/conmon run --rm -ti fedora sh -c 'printenv NOTIFY_SOCKET; systemd-notify --ready'
Error: container create failed (no logs from conmon): EOF
src/conmon.c
Outdated
@@ -84,6 +84,16 @@ int main(int argc, char *argv[]) | |||
exit(0); | |||
} | |||
|
|||
char *notify_socket_path = getenv("NOTIFY_SOCKET"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we include the commit 52546c919be261e74847981b4db1227537c47345
into the commit that first introduced the change?
Also, I think we need a way to disable the notify socket to be handled by conmon, there are cases when it is helpful to use the handling in the OCI runtime. For example when the user needs $RUNTIME start $CTR
to wait for the container to be ready.
@giuseppe you need goochjj/podman#1 |
@giuseppe re: commit, yes, I would squash this whole PR before merging. |
@giuseppe Re: letting OCI runtime do Notify - I'm ok with that, but I think it basically means extending the --sdnotify option in podman. from containers/podman#6693: --sdnotify conmon-only|container|none With "conmon-only", we send the MAINPID, and clear the NOTIFY_SOCKET so the OCI With "container", we send the MAINPID, and leave the NOTIFY_SOCKET so the OCI The "none" option does what it's always done in the past - passes NOTIFY_SOCKET This removes the need for hardcoded CID and PID files in the command line, and Perhaps "container" should be split. I'm not sure into what and what. Maybe "ociruntime" and "podman". Then we'd have the groundwork to tweak the options passed to conmon. The advantage of having conmon do the proxy is exactly to prevent CRUN/RUNC from blocking on start, to fix containers/podman#6688. |
@goochjj Can you rebase? Let's see about getting this moved forward and merged, I'd love to have this in Podman 2.1.x |
6a4784b
to
1fc2dbf
Compare
@haircommander @mheon What's the deal with the tests... |
f28 and f29 are EOL, and the images haven't been updated. It is on my list this week to update them. sorry for the inconvenience |
Heh no worries, as long as it's not a blocker :-D I did run |
hey @goochjj CI is (mostly) fixed, please rebase! |
1fc2dbf
to
6270ce7
Compare
@haircommander @goochjj Other then the static build fail above is this ready to go in? Would like to get this in to close a Podman issue. |
Yep, should be. |
I think there are some unresolved review comments from @giuseppe |
I'm sorry about the nitpicking @goochjj if you are done with carrying this PR, we can have someone else take over it. A lot relies on conmon and the testing is really subpar, so please excuse my anxiety |
Let me go through and see if I can verify the resolved things |
Resolved just about everything except the open "Should we use ENV or a command-line argument" |
e202333
to
388b405
Compare
@haircommander As far as I know I've completed all I need to - let me know if there's something I need to look at, or you want to further discuss ENV vs cmdline. I can squash the PR whenever you want. |
388b405
to
2eb7590
Compare
2eb7590
to
6628524
Compare
test for cri-o regressions: cri-o/cri-o#4187 |
hold this, there are some failures I need to investigate |
@haircommander Whats up with this one? |
Refactored all the conn_sock functionality to be more generic. It can deal with different types of sockets, stream vs dgram, and reuses all the same callbacks, shutdown and async functionality. Conmon creates a notify socket which podman bind-mounts into the container, and passes in via the spec's environment variables. Conmon relays the READY=1 signal. This is similar to what runc and crun do, but doing it in conmon and NOT passing NOTIFY_SOCKET to the OCI runtime allows us to start up properly without runc and crun blocking on the "start" command. It would also be trivial to add more proxied sockets, i.e. the /dev/log proof of concept I did would now be super easy, if we wanted to revisit that. Signed-off-by: Joseph Gooch <[email protected]>
0a36f41
to
28799f9
Compare
LGTM assuming cri-o/cri-o#4187 passes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Refactored all the conn_sock functionality to be more generic. It can deal
with different types of sockets, stream vs dgram, and reuses all the same
callbacks, shutdown and async functionality.
Conmon creates a notify socket which podman bind-mounts into the container,
and passes in via the spec's environment variables. Conmon relays the
READY=1 signal. This is similar to what runc and crun do, but doing it in
conmon and NOT passing NOTIFY_SOCKET to the OCI runtime allows us to start
up properly without runc and crun blocking on the "start" command.
It would also be trivial to add more proxied sockets, i.e. the /dev/log
proof of concept I did would now be super easy, if we wanted to revisit that.
Signed-off-by: Joseph Gooch [email protected]